ON CONSTRUCTION OF THE SMALLEST ONE-SIDED
CONFIDENCE INTERVAL FOR THE DIFFERENCE OF TWO
PROPORTIONS

By Weizhen Wang

For any class of one-sided $1-\alpha$ confidence intervals with a certain
monotonicity ordering on the random confidence limit, the smallest
interval, in the sense of set inclusion, for the difference of two
proportions of two independent binomial random variables is constructed
based on a direct analysis of the coverage probability function.
A special ordering on the confidence limit is developed and the
corresponding smallest confidence interval is derived. This interval is
then applied to identify the minimum effective dose (MED) for binary
data in dose-response studies, and a multiple test procedure that
controls the familywise error rate at level $\alpha$ is obtained. A
generalization of constructing the smallest one-sided confidence interval to
other discrete sample spaces is discussed in the presence of nuisance
parameters.

1. Introduction. We first focus on an important case, the binomial
distribution. Let $X$ be a binomial random variable with $n$ trials and
probability of success $p_1$, denoted by $\mathrm{Bin}(n,p_1)$, and let $Y$ be an
independent $\mathrm{Bin}(m,p_0)$ random variable. Their probability mass functions
and cumulative distribution functions are denoted by $p_X(x;p_1,n)$,
$p_Y(y;p_0,m)$, $F_X(x;p_1,n)$ and $F_Y(y;p_0,m)$, respectively. The goal of
this paper is to construct the optimal one-sided $1-\alpha$ confidence interval
of the form $[L(X,Y),1]$ for $p_1-p_0$ and to discuss its application and a
generalization to other discrete sample spaces. This type of interval is
important when one needs to establish that $p_1$ is larger than $p_0$ by a
certain amount. An immediate application is a clinical trial where the goal
is to determine whether a treatment is "better" than a control with binary
responses.

There are two general criteria used to evaluate the performance of a
confidence interval:

(i) The accuracy: maintain the coverage probability function of the
confidence interval at least $1-\alpha$, that is,

(1)    $P_{(p_1,p_0)}(L(X,Y)\le p_1-p_0\le 1)\ge 1-\alpha \quad \forall p_1,p_0\in[0,1].$

Any interval satisfying (1) is called a one-sided $1-\alpha$ confidence interval
for $p_1-p_0$. In general, there is no disagreement on criterion (i). When it is
hard to implement (1), an approximate $1-\alpha$ confidence interval is employed.

(ii) The precision: minimize the "size" (e.g., the length) of the confidence
interval within a certain class of intervals.
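
Criterion (i) can be checked numerically for any candidate lower limit. The following is a minimal sketch, assuming a user-supplied function `lower(x, y)` and a finite grid over $(p_1,p_0)$; the function names and the two example limits are hypothetical and serve only as illustration. A grid can miss the exact minimum, so the result is an approximation of the minimal coverage in (1).

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability mass function P(K = k) for K ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def coverage(lower, n, m, p1, p0):
    """P(L(X, Y) <= p1 - p0) for independent X ~ Bin(n, p1), Y ~ Bin(m, p0)."""
    delta = p1 - p0
    total = 0.0
    for x in range(n + 1):
        for y in range(m + 1):
            if lower(x, y) <= delta + 1e-12:  # small tolerance for rounding
                total += binom_pmf(x, n, p1) * binom_pmf(y, m, p0)
    return total

def min_coverage(lower, n, m, grid=21):
    """Smallest coverage over a grid of (p1, p0) values in [0, 1]^2."""
    pts = [i / (grid - 1) for i in range(grid)]
    return min(coverage(lower, n, m, p1, p0) for p1 in pts for p0 in pts)

# Hypothetical lower limits, used only for illustration.
def trivial(x, y):
    return -1.0            # the interval [-1, 1] always covers

def plug_in(x, y):
    return x / 4 - y / 1   # observed difference for n = 4, m = 1, no margin
```

The trivial limit attains coverage 1 everywhere, while the plug-in limit falls well below 0.95 at some parameter points, so it does not satisfy (1) at level 0.95.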

Researchers do have different opinions on how to define an interval with
the "minimum size." For two $1-\alpha$ confidence intervals, $C_1(X,Y)$ and
$C_2(X,Y)$, the most natural way to claim that $C_1(X,Y)$ is no worse than
$C_2(X,Y)$ is that

(2)    $C_1(x,y)$ is a subset of $C_2(x,y)$ for all sample points $(x,y)$.

We call this the set inclusion criterion, proposed in Wang (2006); an
equivalent version was given in Bol'shev (1965). The superiority of $C_1$ over
$C_2$ is easy to check because no expectation computation is involved. Under
this criterion, for a specified class of $1-\alpha$ intervals, we should search for
the smallest $1-\alpha$ confidence interval, which is equal to the intersection of
all intervals in the class, provided that the intersection also belongs to the
class.
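
For one-sided intervals $[L(x,y),1]$, criterion (2) reduces to a pointwise comparison of lower limits, and the intersection of several intervals has the pointwise maximum as its lower limit. The sketch below illustrates this with lower limits stored in dictionaries; the helper names are hypothetical, and the numbers are the 95% limits reported in Table 1 of this paper for $n=4$, $m=1$. Whether the intersection still belongs to the class under study has to be checked separately.

```python
def is_no_worse(lower_1, lower_2):
    """True if [lower_1(z), 1] is a subset of [lower_2(z), 1] at every sample point z."""
    return all(lower_1[z] >= lower_2[z] for z in lower_2)

def intersection_limit(*limits):
    """Lower limit of the intersection of several one-sided intervals [L(z), 1]."""
    return {z: max(l[z] for l in limits) for z in limits[0]}

# Lower limits L_J and L_Z as reported in Table 1 (n = 4, m = 1, 95% level).
L_J = {(4, 0): -0.095, (3, 0): -0.345, (2, 0): -0.562, (1, 0): -0.756,
       (4, 1): -0.757, (3, 1): -0.770, (2, 1): -0.902, (0, 0): -0.950,
       (1, 1): -0.987, (0, 1): -1.0}
L_Z = {(4, 0): -0.095, (3, 0): -0.345, (2, 0): -0.562, (1, 0): -0.950,
       (4, 1): -0.950, (3, 1): -0.950, (2, 1): -0.950, (0, 0): -0.950,
       (1, 1): -1.0, (0, 1): -1.0}

print(is_no_worse(L_J, L_Z))   # [L_J, 1] is contained in [L_Z, 1] everywhere
```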

For the case of the one-sided interval, the smallest interval in a class is 
equivalent to the shortest interval which has the shortest length on all sample 
points in that class. For the case of the two-sided interval, the smallest 
implies the shortest; however, the shortest does not necessarily imply the 
smallest. Also the smallest interval has a clear interpretation. Therefore, we 
use the terminology, "the smallest interval." 

The existence of the smallest interval depends on the class of intervals
from which we search for the smallest. In this paper, we propose two
restrictions on the class:

(a) one-sided $1-\alpha$ confidence interval;

(b) a certain monotonicity on the confidence limit $L(X,Y)$ for all sample
points.

We will show the existence of the smallest interval and give its construction
under these two restrictions. There were successful efforts for the case of a
single proportion $p_1$ based on one observation $X$, where there exists a natural
ordering on the lower limit $L(X)$: $L(x_1)\le L(x_2)$ if $x_1\le x_2$, and there is no
nuisance parameter. The smallest one-sided $1-\alpha$ confidence interval for $p_1$
was derived independently by Bol'shev (1965) and Wang (2006). However,
when there exists a nuisance parameter, results on the smallest one-sided
confidence interval are very limited. Bol'shev and Loginov (1966) partially
generalized Bol'shev's method (1965) to the case with nuisance
parameter(s). Their construction is based on some function of $X$, $Y$ and $p_1-p_0$
but not under condition (b). As we show later, for different orderings on
$L(X,Y)$, the corresponding smallest intervals are different. So the interval
following Bol'shev and Loginov's method for $p_1-p_0$ cannot be the smallest.

The confidence interval was proposed by Laplace in 1814, and its
definition only involves the coverage probability as shown in (1). However,
interval construction based on the coverage probability is not among the
major methods currently used in practice. For example, Casella and Berger
(1990) summarized five methods for the construction: the inversion of a
family of tests; pivotal quantities; a stochastically nonincreasing (or
nondecreasing) distribution family with a single parameter; Bayesian intervals;
and invariant intervals. The first is a general but indirect method because
it inverts tests; during the inversion process, it is not easy to see how the
interval is obtained. The other four need assumptions on the distribution
under study. For example, the second assumes the existence of pivotal
quantities, which do not exist for binomial distributions. Being a major
statistical inference procedure, the confidence interval deserves a direct method
which is based on the analysis of coverage probability and needs mild or no
assumptions on the distribution for its construction. However, it does need
an assumption, restriction (b), on the sample space. The development of
such a method is one goal of our paper. More importantly, this method can
generate the smallest interval in many classes of intervals in the presence of
nuisance parameter(s).

The rest of the paper is organized as follows. In Section 2, we specify
appropriate classes of intervals, and in each class the smallest one-sided $1-\alpha$
confidence interval for $p_1-p_0$ is constructed. In Section 3, we carefully
identify a special class of intervals and then derive the corresponding
smallest interval. An example is given to illustrate the procedure. In Section 4,
we apply the interval in Section 3 to detect the minimum effective dose
(MED), an important issue in clinical trials. A step-down test procedure is
obtained with the familywise error rate controlled at level $\alpha$. In Section 5,
we generalize the construction of the smallest one-sided confidence interval
to other discrete sample spaces, and a Poisson distribution example is
discussed. Some closing remarks are given in Section 6. A confidence interval
with confidence level 0 is of no interest, so we assume $0\le\alpha<1$ throughout
the paper.

2. The smallest one-sided confidence interval. Recall $X\sim\mathrm{Bin}(n,p_1)$
and $Y\sim\mathrm{Bin}(m,p_0)$. Let $\Delta=p_1-p_0$ be the parameter of interest and $p_0$ be
the nuisance parameter. Let

(3)    $S=\{z=(x,y): 0\le x\le n,\ 0\le y\le m,\ x \text{ and } y \text{ are integers}\}$

be the sample space. We use $z$ and $(x,y)$ interchangeably. Also the parameter
space $\{(p_1,p_0): 0\le p_1,p_0\le 1\}$ can be rewritten, in terms of $(\Delta,p_0)$, as the
set $H$ given in (4) and (5).


The interval class that contains the smallest one-sided interval for A is given 
below. 

Definition 1. For any given ordered partition $\{C_j\}_{j=1}^{k_0}$ of $S$, define a
class of one-sided $1-\alpha$ confidence intervals for $\Delta$:

    $\mathcal{B}=\{[L(Z),1]: L(z) \text{ is constant on } C_j \text{ and } L(z)\ge L(z') \text{ for } z\in C_j,\ z'\in C_{j+1},\ \forall j\}.$

Since $L(z)$ is constant on $C_j$, we define $L(C_j)=L(z)$ for any $z\in C_j$.

Remark 1. Any given ordered partition of $S$ defines an ordering on the
lower confidence limit. So we say: search for the smallest one-sided $1-\alpha$
confidence interval that is monotone with respect to the ordered partition
$\{C_j\}_{j=1}^{k_0}$, or simply search for the smallest interval under the ordered
partition. On the other hand, an ordered partition can be obtained from any
given function $L(Z)$ as follows. Let $\{l_j\}_{j=1}^{k_0}$ be a sequence of strictly
decreasing numbers in $j$ that contains all possible values of $L(Z)$. Then define
$C_j=\{z\in S: L(z)=l_j\}$ for $1\le j\le k_0$.
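
The construction in Remark 1 amounts to grouping sample points by the value of $L$ and listing the groups from the largest value to the smallest. A minimal sketch follows, assuming exact equality of the computed values identifies ties (in floating-point work one may prefer to round first); the function name and the example limit $L(x,y)=x/2-y$ are hypothetical.

```python
from itertools import groupby

def ordered_partition(sample_space, lower):
    """Group sample points by lower(z); sets are listed from largest to smallest value."""
    pts = sorted(sample_space, key=lower, reverse=True)
    return [list(group) for _, group in groupby(pts, key=lower)]

# Illustration with n = 2, m = 1 and a hypothetical limit L(x, y) = x/2 - y.
space = [(x, y) for x in range(3) for y in range(2)]
partition = ordered_partition(space, lambda z: z[0] / 2 - z[1])
for j, c_j in enumerate(partition, start=1):
    print(j, c_j)
```

Here $(0,0)$ and $(2,1)$ share the value 0 and therefore fall into the same set $C_3$.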

Definition 2. A confidence interval $[L_S(Z),1]$ in $\mathcal{B}$ is the smallest if
it is a subset of any interval in $\mathcal{B}$; that is, for any $[L(Z),1]$ in $\mathcal{B}$,
$L(z)\le L_S(z)$, $\forall z\in S$.

Definition 2 is adopted from Wang (2006). The smallest interval, if it
exists, is the best in the strongest sense, and automatically minimizes the
false coverage probability and the expected length among all intervals in $\mathcal{B}$.
Next, we prove the existence and provide the construction of the smallest
interval in $\mathcal{B}$.

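In terms of $(\Delta,p_0)$, the parameter space is

(4)    $H=\{(\Delta,p_0): p_0\in D(\Delta) \text{ for each } \Delta\in[-1,1]\},$

where

(5)    $D(\Delta)=\begin{cases}[0,1-\Delta], & \text{if } \Delta\in[0,1],\\ [-\Delta,1], & \text{if } \Delta\in[-1,0).\end{cases}$

Theorem 1. Assume $\alpha\in[0,1)$. For a given ordered partition $\{C_j\}_{j=1}^{k_0}$
of $S$ and any $z\in C_j$, let

(6)    $f_j(\Delta)=\min_{p_0\in D(\Delta)}P(S_j^c),$

where $S_j=\bigcup_{i=1}^{j}C_i$ and $S_j^c$ is its complement in $S$, and

(7)    $G_z=\{\Delta\in[-1,1]: f_j(\Delta')\ge 1-\alpha,\ \forall\Delta'<\Delta\}.$

Define

(8)    $L_S(z)=\begin{cases}\sup G_z, & \text{if } G_z\ne\emptyset,\\ -1, & \text{otherwise.}\end{cases}$

Then:

(1) $[L_S(Z),1]\in\mathcal{B}$;

(2) $[L_S(Z),1]$ is the smallest in $\mathcal{B}$.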

Remark 2. As pointed out in Corollary 1 in Wang (2006), the lower
limit $L(x,y)$ is "the smallest $\theta$ ($\theta=\Delta$ in our case here) where $p(x;\theta)$
($=p_X(x;n,\Delta+p_0)\,p_Y(y;m,p_0)$ in our case here) is used to compute the coverage
probability." Therefore, in order to obtain the largest $L(x,y)$, we should
exclude the term $p_X(x;n,\Delta+p_0)\,p_Y(y;m,p_0)$ from the coverage probability
when $\Delta$ is as large as possible while keeping the coverage probability at
least $1-\alpha$. This is achieved by implementing (6)-(8), where (6) provides the
minimal (with respect to $p_0$) coverage probability (as a function of $\Delta$) for the
desired interval up to $L_S(z)$, (7) assures that its coverage probability is no
smaller than $1-\alpha$ and (8) guarantees that $L_S(z)$ is the largest.

Remark 3. Note that $f_j$, $G_z$ and $L_S$, defined in (6)-(8), depend on $z$
through $C_j$. Let $l_j=L_S(z)$ for $z\in C_j$.

Proof of Theorem 1. To prove (1), first, it is clear that $L_S(z)$ is
constant on each $C_j$ due to Remark 3. Secondly, suppose $L_S(z)<L_S(z')$ for
$z\in C_j$ and $z'\in C_{j+1}$ for some $j$. Then $L_S(z')>-1$. Notice

    $f_j(\Delta)\ge f_{j+1}(\Delta)\ge 1-\alpha \quad \forall\Delta<L_S(z').$

So $L_S(z')\in G_z$ due to (7) and $L_S(z')\le L_S(z)$ due to (8). A contradiction
is constructed. So $L_S(z)\ge L_S(z')$ for $z\in C_j$ and $z'\in C_{j+1}$ for all $j$.
Thirdly, consider the coverage probability function of $[L_S(Z),1]$:
$h_S(\Delta,p_0)=P(L_S(Z)\le\Delta)$. Let

    $h(\Delta)=\min_{p_0\in D(\Delta)}h_S(\Delta,p_0).$

We need to show $h(\Delta)\ge 1-\alpha$ on $[-1,1]$. Note that $[-1,1]$ is partitioned as
$[l_1,1]\cup(\bigcup_{j=1}^{k_0-1}[l_{j+1},l_j))$, where $l_j$ is given in Remark 3. For
$\Delta\in[l_1,1]$, notice $L_S(z)\le\Delta$ for all $z\in S$. Then

    $h(\Delta)=\min_{p_0\in D(\Delta)}P(S)=1\ge 1-\alpha.$

For $\Delta\in[l_{j+1},l_j)$ for any $1\le j\le k_0-1$, notice $L_S(z)\le\Delta$ for any $z\in S_j^c$.
Then

    $h(\Delta)\ge\min_{p_0\in D(\Delta)}P(S_j^c)=f_j(\Delta)\ge 1-\alpha$

by (7). Thus $[L_S(Z),1]\in\mathcal{B}$.

To prove (2), suppose $[L_S(Z),1]$ is not the smallest. Then there exists an
interval $[L^*(Z),1]$ in $\mathcal{B}$ and a point $z^*\in C_j$ for some $j$ with $L_S(z^*)<L^*(z^*)$.
Let $l_i^*=L^*(z)$ for $z\in C_i$ for $i=1,\ldots,k_0$. Then $l_j<l_j^*$. Let $h^*(\Delta,p_0)$ be the
coverage probability of the interval $[L^*(Z),1]$. For any $\Delta\in I=[l_j,l_j^*)$ (not
an empty interval), we have

(9)    $1-\alpha\le h^*(\Delta,p_0)=P(L^*(Z)\le\Delta)\le P(S_j^c).$

The second inequality holds because $\{z: L^*(z)\le\Delta\}$ is contained in $S_j^c$
when $\Delta\in I$. Therefore, $f_j(\Delta)=\min_{p_0\in D(\Delta)}P(S_j^c)\ge 1-\alpha$ on interval $I$,
which contradicts (8). □

Proposition 1. For any one-sided $1-\alpha$ confidence limit $L(Z)$ with
$0\le\alpha<1$,

(10)    $\min_{z\in S}L(z)=-1.$

Proof. Suppose $c=\min_{z\in S}L(z)>-1$. Pick a point $(\Delta_0,p_{00})=((-1+c)/2,1)$
in the parameter space. Note $\Delta_0<L(z)$ for any $z\in S$; then

    $P_{(\Delta_0,p_{00})}(L(Z)\le\Delta_0)=0<1-\alpha,$

which contradicts the fact that $[L(Z),1]$ is of level $1-\alpha$. □

Example 1. Consider the case of $n=4$, $m=1$ and a predetermined
ordered partition of $S$ given by the well-known z-test statistic,

    $Z(x,y)=\dfrac{\hat p_1-\hat p_0}{\sqrt{\hat p_1(1-\hat p_1)/n+\hat p_0(1-\hat p_0)/m}},$

following Remark 1, where $\hat p_1=x/n$ and $\hat p_0=y/m$, and $0/0:=0$, $+/0:=\infty$
and $-/0:=-\infty$. Then this ordered partition $\{C_j^{ZT}\}$ and its associated
smallest 95% confidence interval $[L_{ZT}(Z),1]$ are reported in Table 1.
For the purpose of illustration, we determine $L_{ZT}(3,0)$ here. Consider

    $f_2(\Delta)=\min_{p_0\in D(\Delta)}\bigl(1-p_X(4;4,\Delta+p_0)\,p_Y(0;1,p_0)-p_X(3;4,\Delta+p_0)\,p_Y(0;1,p_0)\bigr).$

Since $f_2(-1)=1$, $L_{ZT}(3,0)$ is equal to $-0.345$, the smallest solution of
$f_2(\Delta)=1-\alpha$ with $\alpha=0.05$. This can be done numerically by calculating
$f_2(\Delta)$ at each $\Delta$ in the order $\Delta=-1,-0.999,-0.998,\ldots$, with an
increment of 0.001, for example, for as long as $f_2(\Delta)$ remains at least $1-\alpha$.
The last such value, $\Delta=-0.345$, is the smallest solution.
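
The numerical scan described above can be sketched as follows. This is an illustration under stated assumptions, not the author's code: the minimum over $p_0\in D(\Delta)$ in (6) is approximated on an equally spaced grid (`n_p0` points is an arbitrary choice), the function names are hypothetical, and the returned value is the last grid point before the approximated $f_j$ first drops below $1-\alpha$.

```python
from math import comb

def pmf(k, n, p):
    """Binomial pmf P(K = k) for K ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def nuisance_range(delta):
    """D(delta) in (5): all p0 such that p0 and p1 = p0 + delta lie in [0, 1]."""
    return (0.0, 1.0 - delta) if delta >= 0 else (-delta, 1.0)

def f_j(delta, s_j, n, m, n_p0=200):
    """Grid approximation of (6): min over p0 in D(delta) of P(S_j^c)."""
    lo, hi = nuisance_range(delta)
    worst = 1.0
    for i in range(n_p0 + 1):
        p0 = min(1.0, max(0.0, lo + (hi - lo) * i / n_p0))
        p1 = min(1.0, max(0.0, p0 + delta))
        mass_s_j = sum(pmf(x, n, p1) * pmf(y, m, p0) for x, y in s_j)
        worst = min(worst, 1.0 - mass_s_j)
    return worst

def smallest_lower_limit(s_j, n, m, alpha=0.05, step=0.001):
    """Scan delta = -1, -1 + step, ... following (7)-(8); stop at the first failure."""
    for k in range(round(2.0 / step) + 1):
        if f_j(-1.0 + k * step, s_j, n, m) < 1.0 - alpha:
            return -1.0 + max(k - 1, 0) * step
    return 1.0

# S_1 = C_1 and S_2 = C_1 U C_2 for the z-test partition with n = 4, m = 1.
limit_40 = smallest_lower_limit([(4, 0)], 4, 1)
limit_30 = smallest_lower_limit([(4, 0), (3, 0)], 4, 1)
print(round(limit_40, 3), round(limit_30, 3))
```

Up to the grid resolution, the two values should agree with the entries $-0.095$ and $-0.345$ reported in Table 1. Because a grid can miss the exact minimum over $p_0$, a finer grid or an exact optimizer would be needed before relying on the third decimal.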

It is well known that a $1-\alpha$ confidence interval can generate a family of
level-$\alpha$ tests and vice versa. Interval $[L_S(Z),1]$ in Theorem 1 can be used
for this purpose for the following family of hypotheses:

(11)    $H_0(\delta): \Delta\le\delta$ vs. $H_A(\delta): \Delta>\delta,$

where $\delta\in[-1,1]$. For any given $\delta$, the rejection region,

(12)    $R_S(\delta)=\{z\in S: L_S(z)>\delta\},$

defines a level-$\alpha$ test for (11). For the ordered partition $\{C_j\}_{j=1}^{k_0}$ of $S$, let

    $j(\delta)=\max\{j: L_S(C_j)>\delta\},$

or $j(\delta)=0$ if $L_S(C_1)\le\delta$. Then

    $R_S(\delta)=\bigcup_{j=1}^{j(\delta)}C_j$

due to $L_S(C_{j+1})\le L_S(C_j)$ and the definition of $j(\delta)$. On the other hand,
$[L_S(Z),1]$ can also be obtained by inverting tests as follows. For an ordered
partition $\{C_j\}_{j=1}^{k_0}$, consider a level-$\alpha$ rejection region of the form
$\bigcup_{j=1}^{s(\delta)}C_j$ for


Table 1

Different ordered partitions and their associated smallest 95% intervals when n = 4 and m = 1

 j   C_j^{ZT}   Z        L_{ZT}(z)  |  j   C_j^A    L_A(C_j^A)   L_Z(z)  |  j    C_j^J    L_J(z)
 1   (4, 0)     ∞        -0.095     |  1   (4, 0)    1           -0.095  |  1    (4, 0)   -0.095
 2   (3, 0)     3.464    -0.345     |  2   (3, 0)    0.394       -0.345  |  2    (3, 0)   -0.345
 3   (2, 0)     2.000    -0.562     |  3   (2, 0)    0.089       -0.562  |  3    (2, 0)   -0.562
 4   (1, 0)     1.155    -0.756     |  4   (4, 1)    0           -0.950  |  4    (1, 0)   -0.756
 5   (4, 1)     0        -0.950     |      (0, 0)                        |  5    (4, 1)   -0.757
     (0, 0)                         |  5   (1, 0)   -0.106       -0.950  |  6    (3, 1)   -0.770
 6   (3, 1)    -1.155    -0.950     |  6   (3, 1)   -0.606       -0.950  |  7    (2, 1)   -0.902
 7   (2, 1)    -2.000    -0.950     |  7   (2, 1)   -0.911       -0.950  |  8    (0, 0)   -0.950
 8   (1, 1)    -3.464    -0.987     |  8   (0, 1)   -1           -1      |  9    (1, 1)   -0.987
 9   (0, 1)    -∞        -1         |  9   (1, 1)   -1.106       -1      |  10   (0, 1)   -1
some nonnegative integer $s(\delta)$ for hypothesis $H_0(\delta)$ given in (11) with a
fixed $\delta$, where

(13)    $s(\delta)=\max\Bigl\{n\le k_0: \sup_{(\Delta,p_0)\in H_0(\delta)}P\Bigl(\bigcup_{j=1}^{n}C_j\Bigr)\le\alpha\Bigr\}.$

So $(\bigcup_{j=1}^{s(\delta)}C_j)^c$, the complement of $\bigcup_{j=1}^{s(\delta)}C_j$, is the acceptance
region. For a sample point $Z=z$, let

(14)    $C_T(z)=\Bigl\{\delta\in[-1,1]: z\in\Bigl(\bigcup_{j=1}^{s(\delta)}C_j\Bigr)^c\Bigr\}.$

Then $[L_S(Z),1]$ equals $C_T(Z)$ as shown in Theorem 2 below. However, we
lose the intuition given in (6)-(8) during this inversion process.



Theorem 2. $C_T(Z)$ belongs to the interval class $\mathcal{B}$ given in Definition
1, and

(15)    $[L_S(Z),1]=C_T(Z).$

Proof. First, for any sample point $z$, if $\Delta\in C_T(z)$ and $\Delta'\in[\Delta,1]$, then
$s(\Delta')\le s(\Delta)$ following (13) and (14). Therefore, $\Delta'\in C_T(z)$, and $C_T(z)$ is
a confidence interval for $\Delta$. Second, the coverage probability of $C_T(Z)$ satisfies

    $P(\Delta\in C_T(Z))=P(z: \Delta\in C_T(z))=P\Bigl(\Bigl(\bigcup_{j=1}^{s(\Delta)}C_j\Bigr)^c\Bigr)\ge 1-\alpha$

following (13) for any given $(\Delta,p_0)$. So $C_T(Z)$ is of level $1-\alpha$. Third, let
$C_T(z)=[L_T(z),1]$. (i) It is clear that $L_T(z)$ is constant on each $C_j$; (ii)
Pick $z_1\in C_i$, any $\delta_1\in C_T(z_1)$ and $z_2\in C_{i+1}$. Since $z_1\in(\bigcup_{j=1}^{s(\delta_1)}C_j)^c$, we
have $s(\delta_1)<i$. Thus $s(\delta_1)<i+1$ and $z_2\in(\bigcup_{j=1}^{s(\delta_1)}C_j)^c$, and we conclude
that $\delta_1\in C_T(z_2)$. Therefore, $L_T(C_i)$ is nonincreasing in $i$ for $1\le i\le k_0$.
So $C_T(Z)$ belongs to the interval class $\mathcal{B}$. This implies $[L_S(Z),1]\subset C_T(Z)$
following Theorem 1.

Now we only need to prove

(16)    $[L_S(Z),1]\supset C_T(Z)\ (=[L_T(Z),1]).$

Without loss of generality, assume $L_S(C_j)$ is strictly decreasing in $j$.
Otherwise we redefine a new ordered partition $\{C_j\}$ by merging those $C_j$s on
which $L_S(C_j)$ does not change, so that $L_S(C_j)$ is strictly decreasing. Suppose
(16) is not true. Then let $j_0$ be the smallest positive integer so that
$L_T(C_{j_0})<L_S(C_{j_0})$. Pick $\delta_0\in(\max\{L_T(C_{j_0}),L_S(C_{j_0+1})\},L_S(C_{j_0}))$. Note

    $\bigcup_{j=1}^{j_0-1}C_j=\{z: L_T(z)\le\delta_0\}^c=\bigcup_{j=1}^{s(\delta_0)}C_j.$

So

(17)    $s(\delta_0)=j_0-1.$

On the other hand,

    $1-\alpha\le\inf_{(\Delta,p_0)\in H_0(\delta_0)}P(L_S(Z)\le\Delta)\le\inf_{(\Delta,p_0)\in H_0(\delta_0)}P(L_S(Z)\le\delta_0)=1-\sup_{(\Delta,p_0)\in H_0(\delta_0)}P\Bigl(\bigcup_{j=1}^{j_0}C_j\Bigr).$

Due to (13), $s(\delta_0)\ge j_0$, which contradicts (17). Then (16) is true, as well
as (15). □

Remark 4. When applying Theorem 1, our intention is not to generate
the optimal interval among all possible orderings, but to improve or modify
a given interval $[L(Z),1]$, which has level $1-\alpha$ or approximately $1-\alpha$, to
be the smallest $1-\alpha$ interval. To achieve this, one forms an ordered partition
for $S$ following Remark 1 using the given function $L(Z)$, then derives the
smallest interval $[L_S(Z),1]$ following Theorem 1.

Example 1 (Continued). The most commonly used one-sided interval
for $\Delta$ in practice is the following z-interval:

(18)    $[L_A(z),1]=\bigl[\hat p_1-\hat p_0-z_\alpha\sqrt{\hat p_1(1-\hat p_1)/n+\hat p_0(1-\hat p_0)/m},\ 1\bigr],$

where $z_\alpha$ is the upper $\alpha$th percentile of the standard normal distribution.
Its coverage probability can be much less than the nominal level $1-\alpha$. We
follow Remark 4 to modify this interval by generating an ordered partition of
$S$, denoted by $\{C_j^A\}$. Then the smallest $1-\alpha$ interval, denoted by $[L_Z(Z),1]$,
based on this partition is derived following Theorem 1 and is reported in
Table 1 for the case in Example 1. Note $C_9^A=(1,1)$ instead of $(0,1)$, which
is intuitively incorrect. Therefore, interval $[L_Z(Z),1]$ is not recommended.
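
The lower limits in (18) are easy to tabulate, which makes the counterintuitive ranking visible. The sketch below evaluates $L_A$ for $n=4$, $m=1$, $\alpha=0.05$ using only the Python standard library; the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_lower_limit(x, y, n, m, alpha=0.05):
    """Lower limit L_A of the one-sided z-interval in (18)."""
    p1_hat, p0_hat = x / n, y / m
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # upper alpha-th percentile
    se = sqrt(p1_hat * (1 - p1_hat) / n + p0_hat * (1 - p0_hat) / m)
    return p1_hat - p0_hat - z_alpha * se

n, m = 4, 1
limits = {(x, y): z_lower_limit(x, y, n, m)
          for x in range(n + 1) for y in range(m + 1)}
# Sample points ranked from the largest to the smallest lower limit.
ranking = sorted(limits, key=limits.get, reverse=True)
for z in ranking:
    print(z, round(limits[z], 3))
```

The point $(1,1)$ receives a limit below $-1$ and is ranked after $(0,1)$, which is the ordering behind $C_9^A=(1,1)$ in Table 1.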



Is it possible to improve the smallest interval from a given ordered
partition? Yes, especially when there exists a finer partition than the given one,
as stated below. See such an example in Table 1, where $L_J(Z)\ge L_{ZT}(Z)\ge L_Z(Z)$.
Each $C_j^J$, given in the second-to-last column of Table 1, contains a
single sample point, which implies that $\{C_j^J\}$ is a finest partition of $S$.

Proposition 2. For two ordered partitions $\mathcal{P}=\{C_j\}_{j=1}^{k_0}$ and
$\mathcal{P}^*=\{C_j^*\}_{j=1}^{k_0^*}$ of the sample space $S$, suppose each $C_j$ is a subset of
$C^*_{i_j}$ for some $i_j$, where $i_j$ is a nondecreasing function in $j$ (i.e., $\mathcal{P}$ is a
finer partition than $\mathcal{P}^*$). Let $[L_S(Z),1]$ and $[L_S^*(Z),1]$ be the smallest $1-\alpha$
confidence intervals under ordered partitions $\mathcal{P}$ and $\mathcal{P}^*$, respectively. Then

(19)    $L_S^*(z)\le L_S(z) \quad \forall z\in S.$

Proof. The proof is trivial if one notices that any ordering on $L(Z)$ by
$\mathcal{P}^*$ is also an ordering by $\mathcal{P}$. Then the claim follows from Theorem 1. □

3. An ordering on the confidence limits. Which ordered partition of $S$
provides an interval that cannot be uniformly improved? Roughly speaking,
by Theorem 1, we prefer an ordering on $L(z)$ ($=L(x,y)$)

(20)    that yields a large smallest solution of $f_j(\Delta)=1-\alpha$ for all $j$s.

Due to Proposition 2, each set in the partition would contain only one point.
Because of the specialty of binomial distributions, $L(x,y)$ should satisfy:

(1) $L(x_1,y)\le L(x_2,y)$ for $x_1\le x_2$; and

(2) $L(x,y_1)\ge L(x,y_2)$ for $y_1\le y_2$.

Let $\mathcal{B}_B$ denote the class of all one-sided $1-\alpha$ intervals for $\Delta$ satisfying (1)
and (2). We will search for optimal intervals, perhaps admissible ones, from
$\mathcal{B}_B$ in this section.

It is clear that $L(n,0)$ must be the largest among all $L(x,y)$s and the
second largest $L(x,y)$ should be achieved at either $(n-1,0)$ or $(n,1)$ or
both (if $n=m$). We, by induction, construct an ordered partition of $S$,
denoted by $\{C_j^B\}_{j=1}^{k_B}$, that satisfies (1) and (2) and starts at point $(n,0)$ as
follows:

Step 1: Let $C_1^B=\{(n,0)\}$, $m_1=1$ and $m_0=0$ because $L(n,0)$ is the
largest among all $L(x,y)$s. So $C_1^B=\{(x_i,y_i)\}_{i=m_0+1}^{m_1}$ where $(x_1,y_1)=(n,0)$.

Step 2: Suppose, by induction, $\{C_j^B\}_{j=1}^{k}$ are available for some positive
integer $k$, where

    $C_j^B=\{(x_i,y_i)\}_{i=m_{j-1}+1}^{m_j}$

for some nonnegative integers $m_0,m_1,\ldots,m_k$ satisfying:

(I) $L(x,y)$ is constant on $C_j^B$;

(II) $L(x_{m_{j-1}},y_{m_{j-1}})>L(x_{m_j},y_{m_j})$ for each $j\le k$.

Now we determine $C_{k+1}^B$. Let $S_k=\bigcup_{j=1}^{k}C_j^B$ and let $N_k$ be the "neighbor"
set of $S_k$, that is,

    $N_k=\{(x,y)\in S: (x,y)\notin S_k;\ (x+1,y)\in S_k \text{ or } (x,y-1)\in S_k\}.$
Due to (1) and (2), some points in $N_k$ are disqualified from being in $C_{k+1}^B$. To
exclude these points, let $NC_k$ be the "candidate" set within $N_k$ satisfying

(21)    $NC_k=\{(x,y)\in N_k: (x+1,y)\notin N_k \text{ and } (x,y-1)\notin N_k\}.$

Therefore, $C_{k+1}^B$ must be a subset of $NC_k$, and a point selected from $NC_k$
automatically guarantees (1) and (2). For each point $z_0=(x_0,y_0)$ in $NC_k$,
consider

    $f_{z_0}(\Delta)=\min_{p_0\in D(\Delta)}P((\{z_0\}\cup S_k)^c)=\min_{p_0\in D(\Delta)}\sum_{(x,y)\in(\{z_0\}\cup S_k)^c}p_X(x;p_0+\Delta,n)\,p_Y(y;p_0,m).$

Let

(22)    $E_{z_0}=\{\Delta\in[-1,1]: f_{z_0}(\Delta')\ge 1-\alpha,\ \forall\Delta'<\Delta\}$

and

(23)    $L_0(z_0)=\begin{cases}\sup E_{z_0}, & \text{if } E_{z_0}\ne\emptyset,\\ -1, & \text{otherwise.}\end{cases}$

Define

(24)    $C_{k+1}^B=\bigl\{z\in NC_k: L_0(z)=\max_{z_0\in NC_k}L_0(z_0)\bigr\}$ and

(25)    $m_{k+1}=m_k+$ the number of elements in $C_{k+1}^B$.

Note that $C_{k+1}^B$ may contain more than one point, especially when $n=m$.

By induction, an ordered partition $\{C_j^B=\{z_i\}_{i=m_{j-1}+1}^{m_j}\}_{j=1}^{k_B}$ for $S$ with some
positive integer $k_B$ is constructed. Therefore, the smallest one-sided $1-\alpha$
confidence interval under this ordered partition, denoted by $[L_S(Z),1]$, is
constructed for estimating $\Delta$ following Theorem 1.
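
Steps 1 and 2 can be sketched in code as follows. This is a rough illustration under stated assumptions rather than the author's implementation: $f_{z_0}$ is approximated on a grid over $p_0$, $\Delta$ is scanned on a grid of width `step`, and ties in (24) are detected by equality of the grid values. All function names and the grid sizes are hypothetical choices.

```python
from math import comb

def pmf(k, n, p):
    """Binomial pmf P(K = k) for K ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def f_excluding(delta, excluded, n, m, n_p0):
    """Grid approximation of the min over p0 in D(delta) of P(excluded^c)."""
    lo, hi = (0.0, 1.0 - delta) if delta >= 0 else (-delta, 1.0)
    worst = 1.0
    for i in range(n_p0 + 1):
        p0 = min(1.0, max(0.0, lo + (hi - lo) * i / n_p0))
        p1 = min(1.0, max(0.0, p0 + delta))
        mass = sum(pmf(x, n, p1) * pmf(y, m, p0) for x, y in excluded)
        worst = min(worst, 1.0 - mass)
    return worst

def limit(excluded, n, m, alpha, step, n_p0):
    """Last grid value of delta before the minimal coverage drops below 1 - alpha."""
    for k in range(round(2.0 / step) + 1):
        if f_excluding(-1.0 + k * step, excluded, n, m, n_p0) < 1.0 - alpha:
            return -1.0 + max(k - 1, 0) * step
    return 1.0

def build_partition(n, m, alpha=0.05, step=0.001, n_p0=50):
    """Inductive construction of the ordered partition, starting at (n, 0)."""
    space = {(x, y) for x in range(n + 1) for y in range(m + 1)}
    parts = [[(n, 0)]]                      # Step 1
    s_k = {(n, 0)}
    while s_k != space:                     # Step 2, repeated
        n_k = {(x, y) for (x, y) in space - s_k
               if (x + 1, y) in s_k or (x, y - 1) in s_k}          # neighbor set
        nc_k = {(x, y) for (x, y) in n_k
                if (x + 1, y) not in n_k and (x, y - 1) not in n_k}  # (21)
        l0 = {z0: limit(s_k | {z0}, n, m, alpha, step, n_p0) for z0 in nc_k}  # (23)
        best = max(l0.values())
        new = sorted(z for z in nc_k if l0[z] == best)             # (24)
        parts.append(new)
        s_k |= set(new)
    return parts

parts = build_partition(4, 1)
for j, c_j in enumerate(parts, start=1):
    print(j, c_j)
```

For $n=4$ and $m=1$ the first sets produced should be $\{(4,0)\}$, $\{(3,0)\}$ and $\{(4,1)\}$. When two candidates have nearly equal $L_0$ values, the grid resolution decides which one is selected, so a finer grid is advisable in such cases.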



Remark 5. $E_{z_0}$ and $G_z$ ($f_{z_0}$ and $f_j$, $L_0$ and $L_S$) are defined in a
similar way. From (24), the ordered partition $\{C_j^B\}_{j=1}^{k_B}$ tends to yield a
large $L_0(z)$, which results in a short interval compared with other
partitions. More precisely, $L_S(C_j^B)$ equals the largest possible value provided that
$L_S(C_1^B),\ldots,L_S(C_{j-1}^B)$ are determined. However, different from $(G_z,f_j,L_S)$,
which depends on $z$ through $C_j^B$, $(E_{z_0},f_{z_0},L_0)$ depends on each individual
$z_0$. If $C_j^B$ always contains a single point for any $j$, then, for $z\in C_j^B$, $L_S(z)$
equals the largest of the $L_0(z_0)$s in the previous step, and we have the following
result.
Proposition 3. For ordered partition $\{C_j^B\}_{j=1}^{k_B}$ and interval $[L_S(Z),1]$
constructed in steps 1 and 2, if each $C_j^B$ contains only one sample point (i.e.,
$m_j=m_{j-1}+1$ for all $j$s), then $[L_S(Z),1]$ is admissible in $\mathcal{B}_B$. That is, for
an interval $[L(Z),1]\in\mathcal{B}_B$, if $L_S(z)\le L(z)$ for any $z\in S$, then $L_S(z)=L(z)$
for any $z\in S$.

Proof. Suppose the claim is not true. Note that each $C_j^B$ contains only
one point, and let $j_0$ be the smallest positive integer so that $L_S(C_{j_0}^B)<
L(C_{j_0}^B)$. Let $\{C_j\}$ be the associated ordered partition for $[L(Z),1]$. Then
$C_j^B=C_j$ for any $j<j_0$. Therefore, $C_{j_0}$ is a subset of $NC_{j_0-1}$ given in (21)
due to conditions (1) and (2). Noting (24), we conclude $L_S(C_{j_0}^B)\ge L(C_{j_0}^B)$.
A contradiction is constructed. □

Conditions (1) and (2) were first proposed in Barnard (1947) and called
the "C" condition. He constructed an optimal rejection region for a
hypothesis testing problem,

    $H_0(0): \Delta\le 0$ vs. $H_A(0): \Delta>0,$

a special case of (11) for $\delta=0$, using a special ordering on $S$. This
ordering satisfies conditions (1) and (2), and the corresponding ordered
partition $\{C_j\}_{j=1}^{k_0}$ is generated by induction starting at $C_1=(n,0)$. Also, for
given $C_1,\ldots,C_{j_0-1}$ with a positive integer $j_0$ ($\le k_0$), $C_{j_0}$ is chosen so that
$\sup_{p_0\in D(0)}P(\bigcup_{j=1}^{j_0}C_j)$ is minimized. So Barnard's ordering is similar to ours,
except that he focused on $\Delta=0$, but we deal with all $\Delta\in[-1,1]$. As pointed
out by Martin Andres and Silva Mato (1994), Barnard's test is the (overall)
most powerful existing test for comparing two independent proportions.

Example 1 (Continued). Now construct the smallest 95% confidence
interval with partition $\{C_j^B\}_{j=1}^{k_B}$. First $C_1^B=\{(4,0)\}$ following step 1, and
$L_S(4,0)=-0.095$ by solving

    $f_1(\Delta)=\min_{p_0\in D(\Delta)}\bigl(1-p_X(4;4,\Delta+p_0)\,p_Y(0;1,p_0)\bigr)=0.95$

because $f_1$ now is nonincreasing in $\Delta$. In step 2, $N_1$, the neighbor set of
$S_1$ ($=C_1^B$), is equal to $\{(3,0),(4,1)\}$, and $NC_1=N_1$. Following (23),

    $L_0(3,0)=-0.345, \qquad L_0(4,1)=-0.527.$

Thus $C_2^B=\{(3,0)\}$ by (24). In step 3, three sets are needed, $S_2$, $N_2$ and
$NC_2$, and are given below in the sample space $S$. Note here $NC_2\ne N_2$.

          x = 0    x = 1    x = 2          x = 3    x = 4
 y = 1                                     N_2      N_2, NC_2
 y = 0                      N_2, NC_2      S_2      S_2

Again, for each point in $NC_2$, we have $L_0(2,0)=-0.561$, $L_0(4,1)=-0.527$,
following (23). Then $C_3^B=\{(4,1)\}$ by (24). The rest of the interval
construction is given in Table 2. Following Remark 5, since each $C_j^B$ contains a single
point, $L_S(z)$ on $C_j^B$ is equal to the largest $L_0(z_0)$ in the previous step and is
reported in the last column of Table 2, and the construction is complete at
the 10th ($=k_B$) step. This interval is admissible in $\mathcal{B}_B$ due to Proposition 3.
However, if compared with interval $[L_J(Z),1]$ in Table 1, neither uniformly
dominates the other.

4. Identifying the minimum effective dose. Suppose we have a sequence
of independent binomial random variables $X_i\sim\mathrm{Bin}(n_i,p_i)$ for $i=1,\ldots,k$
and $Y\sim\mathrm{Bin}(m,p_0)$. The goal here is to identify the smallest positive
integer $i_0$ so that $p_i>p_0+\delta$ for any $i\in[i_0,k]$, where $\delta$ is some predetermined
nonnegative number. Each $p_i$ is the proportion of patients who show
improvement using a drug at dose level $i$. A large $i$ is associated with a large dose



Table 2

The details of the construction of partition $\{C_j^B\}_{j=1}^{10}$ when n = 4 and m = 1

 j    C_j^B    N_j                    NC_j            L_0(z_0)          L_S(z)
 1    (4,0)    (3,0), (4,1)           (3,0), (4,1)    -0.345, -0.527    -0.095
 2    (3,0)    (2,0), (3,1), (4,1)    (2,0), (4,1)    -0.561, -0.527    -0.345
 3    (4,1)    (2,0), (3,1)           (2,0), (3,1)    -0.578, -0.752    -0.527
 4    (2,0)    (1,0), (2,1), (3,1)    (1,0), (3,1)    -0.757, -0.752    -0.578
 5    (3,1)    (1,0), (2,1)           (1,0), (2,1)    -0.770, -0.902    -0.752
 6    (1,0)    (0,0), (1,1), (2,1)    (0,0), (2,1)    -0.950, -0.902    -0.770
 7    (2,1)    (0,0), (1,1)           (0,0), (1,1)    -0.950, -0.987    -0.902
 8    (0,0)    (0,1), (1,1)           (1,1)           -0.987            -0.950
 9    (1,1)    (0,1)                  (0,1)           -1                -0.987
 10   (0,1)                                                             -1
level, and $p_0$ is the proportion for the control group. Then $i_0$ is called the
minimum effective dose (MED). Finding the MED is important since high
doses often turn out to have undesirable side effects.

Typically, the MED is to be found when $X_i$ follows a normal
distribution with the comparison in proportions replaced by that in means. Thus,
the assumption of normality is an issue to be addressed. See, for example,
Tamhane, Hochberg and Dunnett (1996), Hsu and Berger (1999), Bretz,
Pinheiro and Branson (2005) and Wang and Peng (2008) for results under
this setting. Now we search for the MED with a binary response without
such concern on the distribution; see Tamhane and Dunnett (1999).

A sequence of hypotheses can be formulated to detect the MED as follows:

(26)    $H_{0i}: \min_{j\ge i}\{p_j-p_0\}\le\delta$ vs. $H_{Ai}: \min_{j\ge i}\{p_j-p_0\}>\delta$, for $i=1,\ldots,k,$

which is similar to the one in Hsu and Berger (1999), page 471. It is clear
that the MED equals the smallest $i$ for which $H_{Ai}$ is true. Also $H_{0i}$ is
decreasing in $i$, thus $\mathcal{C}=\{H_{0i}: i=1,\ldots,k\}$ is closed under the operation of
intersection. Suppose a level-$\alpha$ nondecreasing (in $i$) rejection region $R_i$ for
$H_{0i}$ is constructed. Then, for the multiple test problem of testing all null
hypotheses in $\mathcal{C}$, define a multiple test procedure: assert $H_{Ai}$ if $R_i$ occurs.
This procedure controls the familywise error rate at level $\alpha$ following the
closed test procedure by Marcus, Peritz and Gabriel (1976).

Now we apply the interval derived in Section 3 to obtain a level-$\alpha$ test
for $H_{0i}$. Let $L_{S,i}(X_i,Y)$ be the lower limit of the smallest one-sided $1-\alpha$
confidence interval for $p_i-p_0$ obtained in Section 3 before Remark 5. Define a
rejection region for $H_{0i}$:

(27)    $R_i=\bigl\{(x_1,\ldots,x_k,y): \min_{j\ge i}\{L_{S,j}(x_j,y)\}>\delta\bigr\}.$

Theorem 3. The rejection region $R_i$ is nondecreasing in $i$ and is of level
$\alpha$ ($<1$) for $H_{0i}$. Therefore, the multiple test procedure, which asserts not $H_{0i}$
(i.e., asserts $H_{Ai}$) if $R_i$ occurs for any $H_{0i}\in\mathcal{C}$, controls the familywise error
rate at level $\alpha$.

Proof. First, it is trivial that $R_i$ is nondecreasing in $i$. Secondly, for any
$(p_1,\ldots,p_k,p_0)\in H_{0i}$, there exists an $i^*\in[i,k]$ satisfying $p_{i^*}-p_0\le\delta$. Then

    $P_{(p_{i^*},p_0)}(L_{S,i^*}(X_{i^*},Y)>\delta)\le P_{(p_{i^*},p_0)}(L_{S,i^*}(X_{i^*},Y)>p_{i^*}-p_0)$

when $p_{i^*}-p_0\le\delta$. Since $P_{(p_{i^*},p_0)}(L_{S,i^*}(X_{i^*},Y)>p_{i^*}-p_0)\le\alpha$, we have

    $P_{(p_1,\ldots,p_k,p_0)}(R_i)\le P_{(p_{i^*},p_0)}(L_{S,i^*}(X_{i^*},Y)>\delta)\le\alpha.$

The rest of the theorem follows from the closed test procedure by Marcus, Peritz
and Gabriel (1976) because $R_i$ is nondecreasing and $H_{0i}$ is decreasing in $i$.
□
Remark 6. The multiple test procedure with rejection regions $\{R_i\}_{i=1}^{k}$
in (27) is equivalent to the following step-down test procedure.

Step 1. If $R_k$ does not occur, conclude that the MED does not exist and
stop; otherwise go to the next step.

Step 2. If $R_{k-1}$ does not occur, conclude the MED $=k$ and stop;
otherwise go to the next step.

...

Step k. If $R_1$ does not occur, conclude the MED $=2$ and stop; otherwise
conclude the MED $=1$ and stop.
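
The step-down procedure translates directly into a short loop over the dose levels, from the highest to the lowest. The sketch below assumes the observed lower limits $L_{S,i}(x_i,y)$ have already been computed and are passed in as a list; the function name and the numerical inputs are hypothetical.

```python
def minimum_effective_dose(lower_limits, delta):
    """Step-down search for the MED.

    lower_limits[i - 1] holds the observed lower limit for dose i = 1, ..., k.
    Returns the estimated MED, or None if R_k does not occur.
    """
    med = None
    running_min = float("inf")
    for i in range(len(lower_limits), 0, -1):   # examine R_k, R_{k-1}, ..., R_1
        running_min = min(running_min, lower_limits[i - 1])
        if running_min > delta:                 # R_i in (27) occurs
            med = i
        else:
            break                               # stop at the first non-rejection
    return med

# Hypothetical lower limits for k = 3 doses, with delta = 0.
print(minimum_effective_dose([-0.2, 0.1, 0.3], 0.0))
```

In this illustration $R_3$ and $R_2$ occur but $R_1$ does not, so the procedure stops with the MED equal to 2.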

Remark 7. The proposed multiple test procedure is valid without the
assumption $p_1\le p_2\le\cdots\le p_k$.

5. A generalization. Suppose a random vector $\mathbf{X}$ is observed from a
discrete sample space $S$ with either finitely or countably many sample points, that
is, $S=\{\mathbf{x}_i\}_{i=a}^{b}$ where $-\infty\le a\le b\le+\infty$. An ordered partition $\{C_j\}_{j=c}^{d}$ on
$S$ is given for some $-\infty\le c\le d\le+\infty$. The probability mass function of $\mathbf{X}$
is given by $p(\mathbf{x};\boldsymbol{\theta})$, where $\boldsymbol{\theta}$ is the parameter vector belonging to a parameter
space $\Theta$, a subset of $R^k$. Suppose $\boldsymbol{\theta}=(\theta,\eta)$ and

    $\Theta=\{\boldsymbol{\theta}: \eta\in D(\theta) \text{ for each } \theta\in[A,B]\},$

where $[A,B]$ is a given interval in $R^1$ ($A$ and $B$ may be $\pm\infty$, and the interval
is open when the corresponding ending is infinity), and $D(\theta)$ is a subset of
$R^{k-1}$ depending on $\theta$. Now we are interested in searching for the smallest
one-sided $1-\alpha$ confidence interval of the form $[L(\mathbf{X}),B]$ for $\theta$ under the ordered
partition $\{C_j\}_{j=c}^{d}$, i.e., $L(\mathbf{x})$ is constant on each $C_j$ and $L(\mathbf{x})\ge L(\mathbf{x}')$ for any
$\mathbf{x}\in C_j$ and $\mathbf{x}'\in C_{j'}$ for any $j\le j'$.

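Theorem 4. Assume $\alpha\in[0,1)$. For a given partition $\{C_j\}_{j=c}^{d}$ of $S$ and
any $\mathbf{x}\in C_j$, let

(28)    $f_j(\theta)=\inf_{\eta\in D(\theta)}P(S_j^c),$

where $S_j=\bigcup_{i=c}^{j}C_i$, and

(29)    $G_{\mathbf{x}}=\{\theta\in[A,B]: f_j(\theta')\ge 1-\alpha,\ \forall\theta'<\theta\}.$

Define

(30)    $L_G(\mathbf{x})=\begin{cases}\sup G_{\mathbf{x}}, & \text{if } G_{\mathbf{x}}\ne\emptyset,\\ A, & \text{otherwise.}\end{cases}$

Then $[L_G(\mathbf{X}),B]$ is the smallest one-sided $1-\alpha$ confidence interval under
partition $\{C_j\}_{j=c}^{d}$.

The proof is similar to that of Theorem 1 and is omitted.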

Example 2. Suppose one is interested in the difference, $\Delta$, of two
means, $\lambda_1$ and $\lambda_2$, of two independent Poisson random variables, $X$ and
$Y$. The sample space $S$ and the parameter space are

    $S=\{(x,y): x \text{ and } y \text{ are nonnegative integers}\}$ and

    $H=\{(\Delta,\lambda_2): \lambda_2\in[0,+\infty) \text{ if } \Delta\in[0,+\infty);\ \lambda_2\in[-\Delta,+\infty) \text{ if } \Delta\in(-\infty,0)\},$

respectively. One-sided $1-\alpha$ confidence intervals of the form $[L(X,Y),+\infty)$ for
$\Delta$ are of interest. Let $F_P(x;\lambda)$ be the cumulative distribution function of a
Poisson distribution with mean $\lambda$.

Different from the binomial case, there exists no fixed sample point on
which $L(x,y)$ is the largest or the smallest. Lacking a starting or an ending
point, a construction of an ordered partition $\{C_j\}_{j=c}^{d}$ of $S$ by induction is
difficult. Instead, we show how to improve a naive interval $[L_1(X,Y),\infty)$
given below for $\Delta$.

This interval is obtained by combining two smallest one-sided intervals
$[L(X),+\infty)$ for $\lambda_1$ and $[0,U(Y)]$ for $\lambda_2$, both of level $\sqrt{1-\alpha}$. Following Bol'shev
(1965), $L(x)$ satisfies

(31)    $1-F_P(x-1;L(x))=1-\sqrt{1-\alpha}$ for $x>0$ and $L(0)=0,$

and $U(y)$ satisfies

(32)    $F_P(y;U(y))=1-\sqrt{1-\alpha}.$

Then $[L_1(X,Y),\infty)$, where $L_1(X,Y)=L(X)-U(Y)$, is a one-sided
confidence interval of level $1-\alpha$ for $\Delta=\lambda_1-\lambda_2$ because

    $P(L_1(X,Y)\le\Delta)\ge P(L(X)\le\lambda_1,\ U(Y)\ge\lambda_2)\ge(\sqrt{1-\alpha})^2=1-\alpha.$

It is clear that $L_1(x,y)$ satisfies (1) and (2) introduced at the beginning
of Section 3 because $L(x)$ and $U(y)$ are both increasing functions. Now we
improve $[L_1(X,Y),+\infty)$ by constructing $L_G(X,Y)$ following Theorem 4 and
Remark 4.

To illustrate the procedure, suppose $(X,Y)=(4,2)$ is observed and
$\alpha=0.05$. Following (31) and (32), we obtain $L(4)=1.094$ and $U(2)=7.208$,
respectively, and $L_1(4,2)=-6.114$. We need to determine $L_G(4,2)$ given in
(30). Consider a subset $S_j$ of $S$ on which $L_1(x,y)$ is no smaller than $L_1(4,2)$,
that is,

    $S_j=\{(x,y)\in S: L_1(x,y)\ge L_1(4,2)\}=\{(x,y)\in S: x\ge g(y)\},$

where, for each $y$,

    $g(y)=\min\{x\ge 0: L_1(x,y)\ge L_1(4,2)\}.$

For example, $g(0)=g(1)=0$, and $g(3)=7$. Plug $S_j$ into (28) and solve
$L_G(4,2)=-4.744$ following (29) and (30). $L_G(4,2)$ is much larger than
$L_1(4,2)$, as is well expected.
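
The naive limits in (31) and (32) and the boundary function $g(y)$ can be reproduced with a few lines of standard-library code. The following is a sketch only: the root finder is a plain bisection, the bracketing intervals are assumptions chosen to be wide enough for small counts, and all function names are hypothetical. It does not compute $L_G(4,2)$ itself, which would require evaluating (28) over the infinite set $S_j$.

```python
from math import exp, sqrt

def poisson_cdf(k, lam):
    """F_P(k; lam) = P(K <= k) for K ~ Poisson(lam)."""
    if k < 0:
        return 0.0
    term = exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total

def bisect(func, lo, hi, tol=1e-10):
    """Root of a continuous function that changes sign on [lo, hi]."""
    sign_lo = func(lo) > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == sign_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lower_limit(x, alpha=0.05):
    """L(x) solving (31); L(0) = 0."""
    if x == 0:
        return 0.0
    target = 1.0 - sqrt(1.0 - alpha)
    return bisect(lambda lam: 1.0 - poisson_cdf(x - 1, lam) - target, 0.0, float(x))

def upper_limit(y, alpha=0.05):
    """U(y) solving (32)."""
    target = 1.0 - sqrt(1.0 - alpha)
    return bisect(lambda lam: poisson_cdf(y, lam) - target, 0.0, 2.0 * y + 20.0)

def g(y, threshold, alpha=0.05):
    """Smallest x >= 0 with L(x) - U(y) >= threshold."""
    x = 0
    while lower_limit(x, alpha) - upper_limit(y, alpha) < threshold:
        x += 1
    return x

naive = lower_limit(4) - upper_limit(2)          # L_1(4, 2)
print(round(lower_limit(4), 3), round(upper_limit(2), 3), round(naive, 3))
print(g(0, naive), g(1, naive), g(3, naive))
```

The printed limits should match the values $1.094$, $7.208$ and $-6.114$ quoted above up to rounding, and the boundary values $g(0)$, $g(1)$ and $g(3)$ should agree with those stated in the example.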
6. Discussion. In this paper, we discuss how to derive the smallest
confidence interval under an ordered partition for a parameter in the presence
of nuisance parameter(s) when the sample space is discrete. The interval
construction is based on a direct analysis of the coverage probability, and
needs an ordering on the sample points and mild assumptions (e.g., a discrete
sample space) on the underlying distribution. The set inclusion criterion is
employed for searching for good intervals because it has a clear
interpretation. Under this criterion, the smallest interval is the best in the strongest
sense provided it exists. It is well known that the existence of the best
interval depends on the class of intervals from which the best is searched.
We successfully characterize such classes by (a) considering one-sided $1-\alpha$
confidence intervals and (b) requiring an ordering on the random confidence
limits. Bol'shev and Loginov (1966) did not construct the interval under (b),
so their method typically does not generate the smallest interval, while ours
does. Another application of the proposed method is to identify admissible
confidence intervals more efficiently. As an example, consider the case in
Section 2. Let $[L_{S,C}(X,Y),1]$ be the smallest $1-\alpha$ confidence interval for
$p_1-p_0$ corresponding to a partition $C$ of $S$. Then the class of $1-\alpha$
confidence intervals,

    $\mathcal{D}=\{[L_{S,C}(X,Y),1]: \forall \text{ partition } C \text{ of } S\},$

is complete since for any $1-\alpha$ confidence interval $[L(X,Y),1]$ there exists
an interval $[L_{S,C}(X,Y),1]$ in $\mathcal{D}$ so that $[L_{S,C}(x,y),1]$ is always a subset of
$[L(x,y),1]$ for any $(x,y)\in S$. Although class $\mathcal{D}$ is not minimal, it contains
finitely many elements. One only needs to search for optimal intervals from
$\mathcal{D}$ because it is complete. Furthermore, one can apply Proposition 2 and
conditions (1) and (2) in Section 3 to search for optimal intervals from a
much smaller subset, also complete in $\mathcal{B}_B$, of $\mathcal{D}$.